Hard Attention vs. Soft Attention
- Hard Attention
- Chooses attention spots (locations) by stochastic sampling
- Trained with reinforcement learning, because the sampling step is not differentiable
- e.g. Recurrent Models of Visual Attention
- Soft Attention
- Considers all spots, each with its own weight (the weights are the attention)
- Keeps the model differentiable, so it can be trained with standard backpropagation
- e.g. Neural Machine Translation by Jointly Learning to Align and Translate
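
The contrast between the two can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the method of either paper: the toy `features` and `scores`, and the helper names `softmax`, `soft_attention`, and `hard_attention`, are hypothetical. Soft attention averages over every spot using the weights; hard attention samples a single spot from the same weights.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into attention weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, scores):
    """Weighted average over ALL spots; smooth in the scores."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]


def hard_attention(features, scores, rng):
    """Sample ONE spot from the attention distribution and return only it."""
    weights = softmax(scores)
    index = rng.choices(range(len(features)), weights=weights, k=1)[0]
    return index, features[index]


# Toy data (assumed for illustration): one feature vector per spot.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 0.5, 0.1]

context = soft_attention(features, scores)
index, glimpse = hard_attention(features, scores, random.Random(0))
```

Here `context` blends all three feature vectors, while `glimpse` is exactly one of them. The discrete choice of `index` is what blocks ordinary gradients in the hard case and motivates reinforcement learning.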
- References